A Geometric Approach for Partitioning N-Dimensional Non-rectangular Iteration Spaces
Abstract
Parallel loops account for the largest share of program parallelism. How much of that parallelism can be exploited, and how much overhead the parallel execution of a nested loop incurs, depends directly on partitioning, i.e., the way the iterations of a parallel loop are distributed across processors. Partitioning of parallel loops is therefore of key importance for high performance and efficient use of multiprocessor systems. Although a significant amount of work has been done on partitioning and scheduling rectangular iteration spaces, the problem of partitioning non-rectangular iteration spaces (e.g., triangular or trapezoidal ones) has not received enough attention so far. In this paper, we present a geometric approach for partitioning N-dimensional non-rectangular iteration spaces to optimize performance on parallel processor systems. Speedup measurements for kernels (loop nests) of linear algebra packages are presented.
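To make the load-balancing problem concrete, the following minimal Python sketch compares two ways of splitting the outer loop of a two-dimensional triangular nest (`for i in range(n): for j in range(i + 1)`) into contiguous chunks. It is an illustration only and not the algorithm of the paper: the equal-area rule, the function names, and the choice of a two-dimensional triangle are all assumptions made for this example.

```python
import math

def triangular_work(lo, hi):
    """Count iterations (i, j) with lo <= i < hi and 0 <= j <= i."""
    return sum(i + 1 for i in range(lo, hi))

def block_bounds(n, p):
    """Naive split: give every processor the same number of outer iterations."""
    step = n // p
    return [(k * step, (k + 1) * step if k < p - 1 else n) for k in range(p)]

def equal_area_bounds(n, p):
    """Hypothetical geometric split: cut the triangle into p strips of equal area.

    The work up to row b grows roughly as b**2 / 2, so the k-th cut is
    placed near n * sqrt(k / p).
    """
    cuts = [round(n * math.sqrt(k / p)) for k in range(p + 1)]
    return list(zip(cuts[:-1], cuts[1:]))

def imbalance(chunks):
    """Ratio of the heaviest chunk's work to the mean work (1.0 is perfect)."""
    loads = [triangular_work(lo, hi) for lo, hi in chunks]
    return max(loads) / (sum(loads) / len(loads))

if __name__ == "__main__":
    n, p = 1000, 4
    print("block split imbalance:     ", imbalance(block_bounds(n, p)))
    print("equal-area split imbalance:", imbalance(equal_area_bounds(n, p)))
```

For `n = 1000` and `p = 4`, the block split leaves the busiest processor with roughly 1.75 times the mean load, while the square-root cuts keep every chunk within about 0.1% of the mean. This is why rectangular-style partitioning transfers poorly to triangular or trapezoidal spaces.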
Similar resources
Compile-Time Partitioning of Three-Dimensional Iteration Spaces
This paper presents a strategy for compile-time partitioning of generalised three-dimensional iteration spaces; it can be applied to loop nests comprising two inner nested loops both of which have bounds linearly dependent on the index of the outermost parallel loop. The strategy is analysed using symbolic analysis techniques for enumerating loop iterations which can provide estimates for the l...
Generating efficient tiled code for distributed memory machines
Tiling can improve the performance of nested loops on distributed memory machines by exploiting coarse-grain parallelism and reducing communication overhead and frequency. Tiling calls for a compilation approach that performs first computation distribution and then data distribution, both possibly on a skewed iteration space. This paper presents a suite of compiler techniques for gen...
Message-passing code generation for non-rectangular tiling transformations
Tiling is a well-known loop transformation used to reduce communication overhead in distributed memory machines. Although a lot of theoretical research has been done concerning the selection of proper tile shapes that reduce processor idle times, there is no complete approach to automatically parallelize non-rectangularly tiled iteration spaces and consequently there are no actual experimental ...
Performance Evaluation of Tiling for the Register Level
Tiling is a well-known loop transformation, which is basically used to expose coarse-grain parallelism and to exploit data reuse at the cache level. However, it can also be used to exploit data reuse at the register level and to improve a program's ILP. Previous work on tiling and also commercial compilers are able to perform tiling for the register level in more than one dimension when the iter...